Restoring the Duality between Principal Components of a Distance Matrix and Linear Combinations of Predictors, with Application to Studies of the Microbiome

Authors

  • Glen A Satten
  • Robert E Tyx
  • Angel J Rivera
  • Stephen Stanfill
Abstract

Appreciation of the importance of the microbiome is increasing, as sequencing technology has made it possible to ascertain the microbial content of a variety of samples. Studies that sequence the 16S rRNA gene, ubiquitous in and nearly exclusive to bacteria, have proliferated in the medical literature. After sequences are binned into operational taxonomic units (OTUs) or species, data from these studies are summarized in a data matrix with the observed counts from each OTU for each sample. Analysis often reduces these data further to a matrix of pairwise distances or dissimilarities; plotting the first two or three principal components (PCs) of this distance matrix often reveals meaningful groupings in the data. However, once the distance matrix is calculated, it is no longer clear which OTUs or species are important to the observed clustering; further, the PCs are hard to interpret and cannot be calculated for subsequent observations. We show how to construct approximate decompositions of the data matrix that pair PCs with linear combinations of OTU or species frequencies, and show how these decompositions can be used to construct biplots, select important OTUs and partition the variability in the data matrix into contributions corresponding to PCs of an arbitrary distance or dissimilarity matrix. To illustrate our approach, we conduct an analysis of the bacteria found in 45 smokeless tobacco samples.
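The idea of pairing PCs of a distance matrix with linear combinations of OTU frequencies can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a Bray-Curtis dissimilarity, uses classical principal coordinates analysis (Gower centering followed by an eigendecomposition), and substitutes a plain least-squares projection of the centered frequency matrix onto each PC to obtain OTU weights. The function names (`bray_curtis`, `pcoa`, `otu_loadings`) and the simulated counts are hypothetical and chosen only for illustration.

```python
import numpy as np

def bray_curtis(freq):
    """Pairwise Bray-Curtis dissimilarity between rows (samples) of freq."""
    n = freq.shape[0]
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            num = np.abs(freq[i] - freq[j]).sum()
            den = (freq[i] + freq[j]).sum()
            d[i, j] = num / den if den > 0 else 0.0
    return d

def pcoa(dist, k=2):
    """Classical PCoA: Gower-center the squared distances, then eigendecompose."""
    n = dist.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    G = -0.5 * J @ (dist ** 2) @ J           # Gower-centered matrix
    evals, evecs = np.linalg.eigh(G)         # eigenvalues in ascending order
    order = np.argsort(evals)[::-1][:k]      # keep the k largest
    return evals[order], evecs[:, order]

def otu_loadings(freq, pcs):
    """Least-squares OTU weights for each PC (a simplification, not the paper's method).

    Because the PC columns are orthonormal, the least-squares fit of
    centered ~ pcs @ L.T is L = centered.T @ pcs.
    """
    centered = freq - freq.mean(axis=0)
    return centered.T @ pcs

# Simulated counts: 6 samples by 4 OTUs (illustrative data only).
rng = np.random.default_rng(0)
counts = rng.poisson(lam=5.0, size=(6, 4)) + 1
freq = counts / counts.sum(axis=1, keepdims=True)   # relative frequencies

dist = bray_curtis(freq)
evals, pcs = pcoa(dist, k=2)
loadings = otu_loadings(freq, pcs)   # one column of OTU weights per PC
```

Each column of `loadings` gives one weight per OTU for the corresponding PC, so OTUs with large absolute weights are candidates for driving the separation seen along that axis. In this simplified version the weights come from a linear projection, so they only approximate the structure of a non-Euclidean dissimilarity.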

Journal title:

Volume 12, Issue

Pages -

Publication date: 2017